Abstract

Multi-antenna arrays with a large number of transmit elements and per-element limitations are the
focus of the present work. In this context, physical layer multigroup multicasting under per-antenna
power constraints is investigated. To address this complex optimization problem, low-complexity
alternatives to semi-definite relaxation are proposed. The goal is to optimize the per-antenna power
constrained transmitter in a maximum fairness sense, which is formulated as a non-convex quadratically
constrained quadratic problem. To this end, the recently developed tool of feasible point pursuit and
successive convex approximation is extended to account for practical per-antenna power constraints.
Interestingly, the novel iterative method exhibits not only superior performance in terms of approaching
the relaxed upper bound but also a significant complexity reduction as the dimensions of the optimization
variables increase. Consequently, multicast multigroup beamforming for large-scale array transmitters
with per-antenna dedicated amplifiers is rendered computationally efficient and accurate. A preliminary
performance evaluation is presented for large-scale systems in which the semi-definite relaxation
consistently yields non rank-1 solutions.

I. Introduction & Related Work

Highly demanding applications (e.g. video broadcasting) stretch the throughput limits of multiuser
broadband systems. To meet such requirements, the adaptation of the physical layer design of next
generation multi-antenna wireless communication systems to the needs of the higher network layers is
imminent. In this direction, physical layer (PHY) multicasting has the potential to efficiently address the
nature of future traffic demand and has become part of the new generation of communication standards.
In line with the recent trends for spectrally efficient massive multiple input multiple output (MIMO)
wireless systems [1], the topic of multicasting over large-scale antenna arrays arises. A brief review of
the state of the art in multicasting follows.

A. PHY Multicasting 

The NP-hard multicast problem was defined and accurately approximated by semi-definite relaxation
(SDR) and Gaussian randomization in [2]. Extending the multicast concept, a unified framework for
physical layer multicasting to multiple co-channel groups, where independent sets of common data are
transmitted to groups of users by the multiple antennas, was given in [3], [4]. In parallel to [?], the
work of [5] involved dirty paper coding methods that are bound to increase the complexity of the
system. Next, a convex approximation method for the max-min fair optimization was proposed in [6],
exhibiting increased performance as the number of users per group grows, but for relatively low numbers
of transmit antennas. In the same context, a similar iterative convex approximation method, this time
for the total power minimization under quality-of-service (QoS) constraints, was considered
in [7]. In this case, the conservative convex approximation of [?] was employed and a channel phase
based user scheduling method was performed as a second step towards increasing the tightness of the
approximation. Finally, in [?], the multicast multigroup problem was solved based on approximations
and uplink-downlink duality.

The literature on multigroup multicast beamforming reviewed so far has only considered sum-power
constraints (SPCs) at the transmitter side. Amid this extensive literature, the optimal multigroup multicast
precoders when a maximum limit is imposed on the transmitted power of each antenna have only
recently been derived in [10], [11]. Therein, a consolidated solution for the weighted max-min fair
multigroup multicast beamforming problem under per-antenna constraints (PACs) is presented. This
framework is based on SDR and Gaussian randomization to solve the QoS problem and bisection to
derive an accurate approximation of the non-convex max-min fair formulation. However, as detailed in
[?], the PACs are bound to increase the complexity of the optimization problem and reduce the accuracy
of the approximation, especially as the number of transmit antennas increases. These observations
necessitate the investigation of lower complexity, accurate approximations that can be applied to
large-scale antenna arrays constrained by practical, per-antenna power limitations.

B. Successive Convex Approximation 

Inspired by the recent development of the feasible point pursuit (FPP) successive convex approximation
(SCA) of non-convex quadratically constrained quadratic problems (QCQPs), as developed in [12], the
present work aims at improving the max-min fair solutions of [11]. The FPP-SCA tool has been
preferred over other existing approximations (for instance [13]) due to its guaranteed feasibility regardless
of the initial state of the iterative optimization [12].

The rest of the paper is structured as follows. The generic per-antenna power constrained multicast
multigroup system model is presented in Sec. II, while the max-min problem is formulated and solved
in Sec. III. In Sec. IV, the performance of the design is evaluated for a specific system setup. Finally,
Sec. V concludes the paper.

Notation: In the remainder of this paper, bold face lower case and upper case characters denote column
vectors and matrices, respectively. The operators $(\cdot)^\dagger$, $|\cdot|$ and $\otimes$ correspond to the conjugate transpose,
the absolute value and the Kronecker product respectively, while $[\cdot]_{ij}$ denotes the $i,j$-th element of a
matrix. An identity matrix of $N \times N$ dimensions is denoted as $\mathbf{I}_N$ and its $k$-th column as $\mathbf{e}_k$. Calligraphic
indexed characters denote sets. $\mathbb{R}_+^M$ denotes the set of real positive $M$-dimensional vectors.

II. System Model 

Assuming a single transmitter, let $N_t$ denote the number of transmitting elements and $N_u$ the total
number of users served. The input-output analytical expression reads as $y_i = \mathbf{h}_i^\dagger \mathbf{x} + n_i$, where $\mathbf{h}_i^\dagger$
is a $1 \times N_t$ vector composed of the channel coefficients (i.e. channel gains and phases) between the
$i$-th user and the $N_t$ antennas of the transmitter, $\mathbf{x}$ is the $N_t \times 1$ vector of the transmitted symbols and
$n_i$ is the complex circular symmetric (c.c.s.), independent identically distributed (i.i.d.), zero
mean Additive White Gaussian Noise (AWGN) measured at the $i$-th user's receive antenna. Focusing
on a multigroup multicasting scenario, let there be a total of $1 \leq G \leq N_u$ multicast groups with
$\mathcal{I} = \{\mathcal{G}_1, \mathcal{G}_2, \dots \mathcal{G}_G\}$ the collection of index sets and $\mathcal{G}_k$ the set of users that belong to the $k$-th multicast group,
$k \in \{1 \dots G\}$. Each user belongs to only one group, thus $\mathcal{G}_i \cap \mathcal{G}_j = \emptyset$, $\forall i \neq j \in \{1 \dots G\}$. Let $\mathbf{w}_k \in \mathbb{C}^{N_t \times 1}$
denote the precoding weight vector applied to the transmit antennas to beamform towards the $k$-th group.
The assumption of independent data transmitted to different groups renders the symbol streams $\{s_k\}_{k=1}^{G}$
mutually uncorrelated and the total power radiated from the antenna array is $P_{tot} = \sum_{k=1}^{G} \mathbf{w}_k^\dagger \mathbf{w}_k$. The
power radiated by each antenna element is a linear combination of all precoders, $P_n = \left[\sum_{k=1}^{G} \mathbf{w}_k \mathbf{w}_k^\dagger\right]_{nn}$,
where $n \in \{1 \dots N_t\}$ is the antenna index.
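
The two power quantities of the system model can be checked numerically. The following is a minimal sketch, assuming unit-power, mutually uncorrelated symbol streams so that the transmit covariance is the sum of the outer products of the precoders; the helper name `radiated_powers` and the array sizes are illustrative choices, not part of the original work.

```python
import numpy as np

def radiated_powers(W):
    """Return (P_tot, P_n) for a precoding matrix W of shape (N_t, G).

    Column k of W is the precoder w_k. Unit-power, mutually uncorrelated
    symbol streams are assumed, so the transmit covariance is W W^H.
    """
    covariance = W @ W.conj().T                  # sum_k w_k w_k^H, (N_t, N_t)
    per_antenna = np.real(np.diag(covariance))   # P_n = [sum_k w_k w_k^H]_nn
    total = float(per_antenna.sum())             # P_tot = sum_k w_k^H w_k
    return total, per_antenna

rng = np.random.default_rng(0)
N_t, G = 4, 2  # illustrative sizes, not taken from the paper
W = rng.standard_normal((N_t, G)) + 1j * rng.standard_normal((N_t, G))
P_tot, P_n = radiated_powers(W)
```

The total power is the trace of the covariance, while a per-antenna constraint bounds each diagonal entry separately, which is why a sum-power feasible design may still violate the limit of an individual amplifier.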

III. Multicast Multigroup under PACs 

A. SDR Based Solution 

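1) Max-Min Fair Formulation:

$$
\begin{aligned}
\mathcal{F}:\ \max_{t,\ \{\mathbf{w}_k\}_{k=1}^{G}}\ & t \\
\text{s.t. } & \frac{1}{\gamma_i} \frac{|\mathbf{w}_k^\dagger \mathbf{h}_i|^2}{\sum_{l \neq k} |\mathbf{w}_l^\dagger \mathbf{h}_i|^2 + \sigma_i^2} \geq t, \quad \forall i \in \mathcal{G}_k,\ k, l \in \{1 \dots G\}, && (1) \\
\text{and } & \left[\sum_{k=1}^{G} \mathbf{w}_k \mathbf{w}_k^\dagger\right]_{nn} \leq P_n, \quad \forall n \in \{1 \dots N_t\}, && (2)
\end{aligned}
$$

where $\mathbf{w}_k \in \mathbb{C}^{N_t}$, $t \in \mathbb{R}^+$ and $\sigma_i^2$ is the noise power at the $i$-th user. The notation $\sum_{l \neq k}$ states that aggregate interference from all co-channel
groups is calculated. Problem $\mathcal{F}$ receives as inputs the PACs vector $\mathbf{p} = [P_1, P_2 \dots P_{N_t}]$ and the target
SINRs vector $\mathbf{g} = [\gamma_1, \gamma_2, \dots \gamma_{N_u}]$. Its goal is to maximize the slack variable $t$ while keeping all SINRs
above this value. Thus, it constitutes a max-min problem that guarantees fairness amongst users. The
main complication of problem $\mathcal{F}$ lies in constraint (1), where a multiplication of the two optimization
variables takes place. To reduce this formulation into the more tractable QCQP form, the following
considerations are made.

2) Per-antenna Power Minimization: A relation between the fairness and the power minimization
problems for the multicast multigroup case under SPCs was first established in [?]. As a result, by
bisecting the solution of the QoS optimization, a solution to the weighted fairness problem can be derived.
Nevertheless, fundamental differences between the SPC formulation and the PAC problem $\mathcal{F}$ complicate
the solution. In more detail, the PACs, i.e. (2), are not necessarily met with equality. A more detailed
discussion on this can be found in [11]. Therefore, a per-antenna power minimization problem has been
proposed in [11] as

$$
\begin{aligned}
\mathcal{Q}:\ \min_{r,\ \{\mathbf{w}_k\}_{k=1}^{G}}\ & r \\
\text{s.t. } & \frac{|\mathbf{w}_k^\dagger \mathbf{h}_i|^2}{\sum_{l \neq k} |\mathbf{w}_l^\dagger \mathbf{h}_i|^2 + \sigma_i^2} \geq \gamma_i, \quad \forall i \in \mathcal{G}_k,\ k, l \in \{1 \dots G\}, && (3) \\
\text{and } & \frac{1}{P_n} \left[\sum_{k=1}^{G} \mathbf{w}_k \mathbf{w}_k^\dagger\right]_{nn} \leq r, \quad \forall n \in \{1 \dots N_t\}, && (4)
\end{aligned}
$$

with $r \in \mathbb{R}^+$. Problem $\mathcal{Q}$ receives as input the SINR constraints for all users, defined before as $\mathbf{g}$, as well as
the per-antenna power constraint vector $\mathbf{p}$ of $\mathcal{F}$. The introduction of the slack variable $r$ constrains the
power consumption of each and every antenna. Subsequently, at the optimum $r^*$, the maximum power
consumption out of all antennas is minimized and this solution is denoted as $r^* = \mathcal{Q}(\mathbf{g}, \mathbf{p})$.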

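Claim 1: Problems $\mathcal{F}$ and $\mathcal{Q}$ are related as follows

$$1 = \mathcal{Q}\left(\mathcal{F}(\mathbf{g}, \mathbf{p}) \cdot \mathbf{g},\ \mathbf{p}\right) \qquad (5)$$

$$t = \mathcal{F}\left(\mathbf{g},\ \mathcal{Q}(t \cdot \mathbf{g}, \mathbf{p}) \cdot \mathbf{p}\right) \qquad (6)$$

(for proof cf. [11]).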

3) Bisection: The establishment of Claim 1 allows for the application of the bisection method, as
developed in [3], [4]. The solution of $r^* = \mathcal{Q}_r(t \cdot \mathbf{g}, \mathbf{p})$ is obtained by bisecting the interval $[L, U]$
as defined by the minimum and maximum SINR values. Since $t = (L + U)/2$ represents the SINR,
it will always be positive or zero. Thus, $L = 0$. Also, if the system were interference free while all
the users had the channel of the best user, then the maximum worst SINR would be attained, thus
$U = \max_i\{P_{tot} Q_i/\sigma_i\}$. If $r^* < 1$, then the lower bound of the interval is updated with the current value of $t$.
Otherwise this value is assigned to the upper bound of the interval. Bisection is iteratively performed
until the interval size is reduced to a pre-specified value $\epsilon$ (herein, $\epsilon = 10^{-?}$). This value needs to
be dependent on the magnitude of $L$ and $U$ so that the accuracy of the solution is maintained regardless
of the region of operation. After a finite number of iterations, the optimal value of $\mathcal{F}$ is given as the
resulting value for which $L$ and $U$ become almost identical, providing an accurate solution for $\mathcal{F}$.
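
The bisection logic described above can be sketched as follows. This is an illustration only: `q_oracle` is a hypothetical callable standing in for a solver of the power minimization problem at SINR targets scaled by $t$, and the toy single-antenna, single-user oracle, the interval and the tolerance are assumptions made so that the result can be checked in closed form.

```python
def bisect_max_min(q_oracle, lower, upper, tol):
    """Bisection over the fairness level t.

    q_oracle(t) is a hypothetical callable returning r*, the smallest
    common scaling of the per-antenna budgets that supports the SINR
    targets scaled by t. r* < 1 means that t is achievable.
    """
    while upper - lower > tol:
        t = 0.5 * (lower + upper)
        if q_oracle(t) < 1.0:
            lower = t   # targets met with power to spare: raise the floor
        else:
            upper = t   # targets need too much power: lower the ceiling
    return 0.5 * (lower + upper)

# Toy single-antenna, single-user oracle (an assumption for illustration):
# SINR = P * |h|^2 / sigma2, hence r*(t) = t * gamma * sigma2 / (P * |h|^2).
P, h_gain, sigma2, gamma = 2.0, 0.5, 0.1, 1.0
toy_oracle = lambda t: t * gamma * sigma2 / (P * h_gain)
t_star = bisect_max_min(toy_oracle, 0.0, 100.0, 1e-6)
```

For this toy oracle the fair SINR level is $P|h|^2/(\gamma\sigma^2) = 10$, which the search recovers to within the chosen tolerance. In the actual method each oracle call is a full optimization, so the number of bisection steps multiplies the per-problem cost.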

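4) Relaxation and Gaussian Randomization: The bisection method, as previously discussed, overcomes
the non-convexity due to the multiplication of two variables, namely $t$ and $\mathbf{w}_k$, in constraint (1).
However, problem $\mathcal{Q}$ still remains non-convex. Based on the observation that $|\mathbf{w}_k^\dagger \mathbf{h}_i|^2 = \mathbf{w}_k^\dagger \mathbf{h}_i \mathbf{h}_i^\dagger \mathbf{w}_k =
\mathrm{Tr}(\mathbf{w}_k^\dagger \mathbf{h}_i \mathbf{h}_i^\dagger \mathbf{w}_k) = \mathrm{Tr}(\mathbf{w}_k \mathbf{w}_k^\dagger \mathbf{h}_i \mathbf{h}_i^\dagger)$ and with the change of variables $\mathbf{X}_k = \mathbf{w}_k \mathbf{w}_k^\dagger$, one can easily identify
that the non-convexity of $\mathcal{Q}$ lies in the necessity to constrain each variable $\mathbf{X}_k$ to have unit rank. By dropping
this constraint, the non-convex $\mathcal{Q}$ can be relaxed to $\mathcal{Q}_r$, which reads as

$$
\begin{aligned}
\mathcal{Q}_r:\ \min_{r,\ \{\mathbf{X}_k\}_{k=1}^{G}}\ & r \\
\text{s.t. } & \frac{\mathrm{Tr}(\mathbf{X}_k \mathbf{h}_i \mathbf{h}_i^\dagger)}{\sum_{l \neq k} \mathrm{Tr}(\mathbf{X}_l \mathbf{h}_i \mathbf{h}_i^\dagger) + \sigma_i^2} \geq \gamma_i, \quad \forall i \in \mathcal{G}_k,\ k, l \in \{1 \dots G\}, && (7) \\
\text{and } & \frac{1}{P_n} \left[\sum_{k=1}^{G} \mathbf{X}_k\right]_{nn} \leq r, \quad \forall n \in \{1 \dots N_t\}, && (8) \\
\text{and } & \mathbf{X}_k \succeq 0, \quad \forall k \in \{1 \dots G\}. && (9)
\end{aligned}
$$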

Following this relaxation, the derivation of the optimal precoders $\mathbf{w}_k$ requires a rank-1 approximation over
the relaxed solution $\mathbf{X}_k$. The approximation with the highest accuracy is proven to be the Gaussian approximation [14]. In
summary, this procedure involves the generation of precoding vectors drawn from a Gaussian distribution
with statistics defined by the relaxed solution. After generating a number of instances and re-scaling
them, the solution with the closest performance to the relaxed upper bound, as given by the optimal point
of $\mathcal{Q}_r$, is chosen. More details on the SDR based solution under PACs can be found in [11].
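
The randomization step summarized above can be illustrated with the following sketch. It is not the procedure of the cited works: the re-scaling is simplified to one common factor that makes the most loaded antenna exactly meet its limit, the identity matrices stand in for an actual relaxed solution, and all function names and sizes are hypothetical.

```python
import numpy as np

def min_weighted_sinr(W, H, groups, gamma, sigma2):
    """Return min_i SINR_i / gamma_i.

    W: (N_t, G) precoders, H: (N_t, N_u) with column i equal to h_i,
    groups[i]: index of the group that user i belongs to.
    """
    gains = np.abs(W.conj().T @ H) ** 2           # gains[k, i] = |w_k^H h_i|^2
    users = np.arange(H.shape[1])
    signal = gains[groups, users]                 # beam of the user's own group
    interference = gains.sum(axis=0) - signal     # beams of all other groups
    return float(np.min(signal / (interference + sigma2) / gamma))

def gaussian_randomization(X, H, groups, gamma, sigma2, p, n_rand, rng):
    """Keep the best of n_rand candidates with w_k drawn from CN(0, X_k)."""
    roots = []
    for X_k in X:
        vals, vecs = np.linalg.eigh(X_k)          # X_k = V diag(vals) V^H
        roots.append(vecs * np.sqrt(np.clip(vals, 0.0, None)))
    n_t = H.shape[0]
    best_W, best_val = None, -np.inf
    for _ in range(n_rand):
        cols = []
        for root in roots:
            v = rng.standard_normal(n_t) + 1j * rng.standard_normal(n_t)
            cols.append(root @ (v / np.sqrt(2)))  # sample with covariance X_k
        W = np.stack(cols, axis=1)
        load = np.sum(np.abs(W) ** 2, axis=1) / p  # P_n / p_n per antenna
        W = W / np.sqrt(load.max())               # most loaded antenna hits its limit
        val = min_weighted_sinr(W, H, groups, gamma, sigma2)
        if val > best_val:
            best_W, best_val = W, val
    return best_W, best_val

rng = np.random.default_rng(1)
N_t, N_u, G = 4, 4, 2                             # illustrative sizes
H = (rng.standard_normal((N_t, N_u)) + 1j * rng.standard_normal((N_t, N_u))) / np.sqrt(2)
groups = np.array([0, 0, 1, 1])
X = [np.eye(N_t, dtype=complex) for _ in range(G)]  # placeholder relaxed solution
p, gamma, sigma2 = np.full(N_t, 0.25), np.ones(N_u), np.ones(N_u)
W_best, val = gaussian_randomization(X, H, groups, gamma, sigma2, p, 100, rng)
```

Every candidate is feasible for the per-antenna limits by construction, so the selection only has to compare the achieved minimum weighted SINR. When the relaxed solution is far from rank one, many candidates are needed and the best one may still fall short of the relaxed bound.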

B. Successive Convex Approximation 

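Problem $\mathcal{Q}$ belongs to the general class of non-convex QCQPs, for which the SDR technique is proven
to be a powerful and computationally efficient approximation technique [14]. However, the FPP-SCA,
a recently proposed alternative to SDR, is herein considered [12]. By defining $\mathbf{w}_{tot} = [\mathbf{w}_1^T, \mathbf{w}_2^T \dots \mathbf{w}_G^T]^T$,
the $i$-th SINR constraint reads as

$$\mathbf{w}_{tot}^\dagger \mathbf{A}_i \mathbf{w}_{tot} \leq -\gamma_i \sigma_i^2, \qquad (10)$$

where $\mathbf{A}_i = \mathbf{A}_i^{(+)} + \mathbf{A}_i^{(-)}$ with $\mathbf{A}_i^{(+)} = \gamma_i (\mathbf{I}_G - \mathrm{diag}\{\mathbf{e}_k\}) \otimes \mathbf{h}_i \mathbf{h}_i^\dagger$ and $\mathbf{A}_i^{(-)} = -\mathrm{diag}\{\mathbf{e}_k\} \otimes \mathbf{h}_i \mathbf{h}_i^\dagger$,
$\forall i \in \mathcal{G}_k$. Assuming a random point $\mathbf{z}$, then by the definition of the negative semi-definite matrix $\mathbf{A}_i^{(-)}$ we have
$(\mathbf{w}_{tot} - \mathbf{z})^\dagger \mathbf{A}_i^{(-)} (\mathbf{w}_{tot} - \mathbf{z}) \leq 0$. By expanding this, a linear restriction of $\mathbf{w}_{tot}$ around $\mathbf{z}$ reads as

$$\mathbf{w}_{tot}^\dagger \mathbf{A}_i^{(-)} \mathbf{w}_{tot} \leq 2\mathrm{Re}\left\{\mathbf{z}^\dagger \mathbf{A}_i^{(-)} \mathbf{w}_{tot}\right\} - \mathbf{z}^\dagger \mathbf{A}_i^{(-)} \mathbf{z}. \qquad (11)$$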


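Consequently, the SINR constraint (10) can be replaced by

$$\mathbf{w}_{tot}^\dagger \mathbf{A}_i^{(+)} \mathbf{w}_{tot} + 2\mathrm{Re}\left\{\mathbf{z}^\dagger \mathbf{A}_i^{(-)} \mathbf{w}_{tot}\right\} - \mathbf{z}^\dagger \mathbf{A}_i^{(-)} \mathbf{z} \leq -\gamma_i \sigma_i^2,$$

in which the unknown variables are quadratic over a positive semi-definite matrix. By adding slack penalties
$\mathbf{s}$, with entries $s_i \geq 0$ on the SINR constraints, the original QCQP problem $\mathcal{Q}$ can be approximated by

$$
\begin{aligned}
\mathcal{Q}_{SCA}:\ \min_{r,\ \mathbf{w}_{tot},\ \mathbf{s}}\ & r + \lambda \|\mathbf{s}\| \\
\text{s.t. } & \mathbf{w}_{tot}^\dagger \mathbf{A}_i^{(+)} \mathbf{w}_{tot} + 2\mathrm{Re}\left\{\mathbf{z}^{(j)\dagger} \mathbf{A}_i^{(-)} \mathbf{w}_{tot}\right\} - \mathbf{z}^{(j)\dagger} \mathbf{A}_i^{(-)} \mathbf{z}^{(j)} \leq -\gamma_i \sigma_i^2 + s_i, \quad \forall i \in \{1 \dots N_u\}, && (12) \\
\text{and } & \frac{1}{P_n} \left[\sum_{k=1}^{G} \mathbf{w}_k \mathbf{w}_k^\dagger\right]_{nn} \leq r, \quad \forall n \in \{1 \dots N_t\}, && (13)
\end{aligned}
$$

where $r \in \mathbb{R}^+$, $\lambda \in \mathbb{R}$ is a fixed input parameter and $\mathbf{z}^{(j)}$ is the $j$-th instance of the introduced auxiliary
variable. In each instance of the SCA algorithm, $\mathcal{Q}_{SCA}$ is solved and the starting point is updated as
$\mathbf{z}^{(j+1)} = \mathbf{w}_{tot}^{(j)}$. The iterative process is repeated until the guaranteed convergence [12].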
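
The construction behind the successive convex approximation can be verified numerically. The sketch below assumes the stacking of all precoders into one long vector and the Kronecker structure of the constraint matrices as read from the text: a positive semi-definite part collecting the interference and a negative semi-definite part collecting the useful signal. It then checks that linearizing the negative semi-definite part around an arbitrary point yields an upper bound. The helper names, sizes and random data are illustrative.

```python
import numpy as np

def sinr_constraint_matrices(h, k, G, gamma):
    """Return (A_plus, A_minus) for a user with channel h in group k (0-based)."""
    Q = np.outer(h, h.conj())                     # h h^H
    E_k = np.zeros((G, G))
    E_k[k, k] = 1.0                               # diag{e_k}
    A_plus = gamma * np.kron(np.eye(G) - E_k, Q)  # PSD: interference part
    A_minus = -np.kron(E_k, Q)                    # NSD: useful-signal part
    return A_plus, A_minus

def quad(A, x, y):
    """Real part of x^H A y."""
    return float(np.real(x.conj() @ A @ y))

rng = np.random.default_rng(2)
N_t, G, k, gamma = 3, 2, 1, 2.0                   # illustrative values
cplx = lambda n: rng.standard_normal(n) + 1j * rng.standard_normal(n)
h, w_tot, z = cplx(N_t), cplx(N_t * G), cplx(N_t * G)
A_plus, A_minus = sinr_constraint_matrices(h, k, G, gamma)

# Left-hand side of the SINR constraint, via the matrices and directly.
lhs = quad(A_plus + A_minus, w_tot, w_tot)
gains = np.abs(w_tot.reshape(G, N_t).conj() @ h) ** 2   # |w_l^H h|^2 per group
direct = gamma * (gains.sum() - gains[k]) - gains[k]

# Linear restriction of the concave term around the point z.
concave_part = quad(A_minus, w_tot, w_tot)
linear_bound = 2 * quad(A_minus, z, w_tot) - quad(A_minus, z, z)
```

Since the linear term upper-bounds the concave one, any point satisfying the restricted constraint also satisfies the original SINR constraint, and the bound is tight at the linearization point itself. This is what allows each iterate to serve as the next linearization point.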

C. Complexity & Convergence discussions 

An important discussion involves the complexity of the employed techniques to approximate a solution
of the highly complex, NP-hard multigroup multicast problem under PACs. Focusing on the SDR based
solution of [?], the main complexity burden originates from the relaxed $\mathcal{Q}_r$. The total worst case
complexity of the SDR based solution of $\mathcal{F}$, as calculated in detail in [11], is summarised in the
following. Initially, a bisection search is performed over $\mathcal{Q}_r$ to obtain the relaxed solution. This bisection
runs for $N_{iter} = \lceil \log_2((U_1 - L_1)/\epsilon_1) \rceil$ iterations, where $\epsilon_1$ is the desired accuracy of the search. Typically $\epsilon_1$
needs to be at least three orders of magnitude below the magnitudes of $U_1, L_1$ for sufficient accuracy.
In each iteration of the bisection search, problem $\mathcal{Q}_r$ is solved. This SDP has $G$ matrix variables
of $N_t \times N_t$ dimensions and $N_u + N_t$ linear constraints. Moreover, in each iteration not more than
$O(G^3 N_t^6 + G N_t^3 + N_u G N_t^2)$ arithmetic operations will be performed. Next, a fixed number of Gaussian
random instances with covariance given by the previous solution are generated. The complexity of this
process is linear with respect to the number of Gaussian randomizations. More details on the total
complexity of the SDR based algorithm can be found in [11] and are herein omitted for brevity.
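
The bisection iteration count quoted above is easy to evaluate. The short sketch below reads it as the ceiling of the base-2 logarithm of the interval length over the accuracy; the numerical values are illustrative and the function name is hypothetical.

```python
import math

def bisection_iterations(lower, upper, accuracy):
    """Worst-case number of bisection steps, ceil(log2((U - L) / eps))."""
    return math.ceil(math.log2((upper - lower) / accuracy))

# An accuracy three orders of magnitude below the upper bound U = 100.
n_iter = bisection_iterations(0.0, 100.0, 0.1)
```

With these values ten steps suffice, and each additional order of magnitude in accuracy adds only three to four further steps. The dominant cost therefore lies in the problem solved at each step rather than in the number of steps.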


As far as the FPP-SCA method is concerned, the iterative process typically runs for a few iterations,
especially for larger values of $\lambda$. As explained in detail in [12], convergence is guaranteed. Therein, $\lambda$
was set to 10, while herein even greater values are chosen, i.e. $\lambda = 25$, since the optimization problems
tackled involve a larger number of constraints. In each iteration of the FPP-SCA, a bisection
search is performed over $\mathcal{Q}_{SCA}$. The latter is a second order cone program with a worst case complexity
of $O((G N_t + N_u)^{3.5})$. This fact justifies the use of the FPP-SCA in scenarios where the number
of transmit antennas exceeds the number of users.

IV. Performance Evaluation & Applications 
A. Uniform Linear Arrays 

To investigate the sensitivity of the proposed algorithm in a generic environment, a
uniform linear array (ULA) transmitter is considered. Assuming far-field, line-of-sight conditions, the
user channels can be modeled using Vandermonde matrices. For this important special case, the SPC
multicast multigroup problem was reformulated into a convex optimization problem and solved in [15],
[16]. These results were motivated by the observation that in sum power constrained ULA scenarios,
the relaxation consistently yields rank-one solutions. Thus, for such cases, the SDR is essentially optimal
[?]. Nevertheless, the SDR of the PAC minimization problem in ULAs is not always tight, as shown in
[?].
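
The Vandermonde structure of far-field, line-of-sight ULA channels can be made concrete with a short sketch. The half-wavelength element spacing, the user angles and the function name `ula_channel` are assumptions of this illustration and are not values stated in the text.

```python
import numpy as np

def ula_channel(theta_deg, n_antennas, spacing=0.5):
    """Far-field line-of-sight ULA response towards angle theta (degrees).

    spacing is the element distance in wavelengths; half a wavelength is an
    assumption of this sketch, not a value stated in the text.
    """
    phase = 2 * np.pi * spacing * np.sin(np.deg2rad(theta_deg))
    return np.exp(1j * phase * np.arange(n_antennas))   # [1, a, a^2, ...]

angles = [-30.0, 5.0, 20.0, 55.0]   # illustrative user directions
H = np.stack([ula_channel(a, 8) for a in angles], axis=1)   # (N_t, N_u)
```

Each column is a geometric progression with unit-modulus ratio, so stacking the user channels gives a Vandermonde matrix. Every antenna sees the same gain magnitude, and only the phase progression distinguishes one user direction from another.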

Let us consider a ULA serving 4 users allocated to 2 distinct groups. In Fig. 1, its radiation pattern
for $N_t = 8$ antennas and for co-group angular separation $\theta_a = 35^\circ$ is plotted. A total power budget of
$P = -3$ dBW is equally distributed amongst the available antennas. For the Gaussian randomization,
$N_{rand} = 100$ instances are considered. Clearly, the multigroup multicast beamforming optimizes the lobes
to reduce interference between the two groups. The beam patterns from both SDR and FPP-SCA
solutions are included in Fig. 1. The superiority of the latter solution in terms of minimum achievable
SINR is apparent. Hereafter, the performance evaluation will be based on the minimum user rate, since
all users are equally weighted in the optimization.

Firstly, the performance with respect to the angular separation of co-group users is investigated, as $\theta_a$ is
increased for both groups in the fashion indicated in Fig. 1. In Fig. 2, when co-group users are collocated,
i.e. $\theta_a = 0^\circ$, the highest minimum rate is attained. As the separation increases, the rate is reduced, reaching
a local minimum when interfering users are placed in the same position, i.e. $\theta_a = 45^\circ$. Then, the lowest
value is observed when co-group users are orthogonal, i.e. $\theta_a = 90^\circ$. In Fig. 2, the lack of tightness
of the relaxation for the SDR based solution is clear as the channel conditions deteriorate. The
only exception is when $\theta_a = 60^\circ$, where the inherent symmetry of the ULA transmitter provides
sufficient conditions for a rank-1 solution to be easily obtained. Interestingly, this is the only situation
where the FPP-SCA method provides a suboptimal solution. For all other instances, the superiority
of the lower complexity solution is clear. Consequently, the FPP-SCA outperforms SDR over the
majority of the span of the angular separations, for moderately sized ULAs. In the same setting, the
normalized simulation time to compute each precoder is given in Fig. 3. Clearly, when the SDR does
not yield rank-1 solutions, the FPP-SCA method not only provides more accurate solutions but
also does so in significantly less time. Gains of almost 50% in terms of simulation time are observed at
$\theta_a = 80^\circ$.

Fig. 1. ULA beampattern for PAC and re-scaled SPC solutions.

Finally, for an angular separation of $\theta_a = 60^\circ$, where the FPP-SCA solution performs worse, the
minimum rate versus an increasing number of transmit antennas is plotted in Fig. 4, while all other
simulation parameters remain unaltered. Therein, the benefits of FPP-SCA as the number of antennas
increases are shown. The SDR solution fails to provide an accurate solution from 10 antennas onwards.
Nevertheless, the FPP-SCA method provides a tight approximation to the upper bound irrespective
of the number of transmit antennas. Notably, the performance gains of almost 20% also come at
reduced complexity. As shown in Fig. 5, the simulation time can be reduced by as much as 80% for
large-scale antenna arrays. It should be clarified that the simulation time figures do not follow the complexity
dependence given in Sec. III-C, simply because the considerations mentioned therein involve worst case
complexity. Existing solvers typically exploit the specific structure of the matrices, thus reducing
the actual execution time.

Fig. 2. ULA performance in terms of minimum SINR per group, for increasing co-group user angular separation.


V. Conclusions 

Herein, the max-min fair multicast multigroup problem under PACs is solved for large-scale antenna
arrays. The accurate and low complexity FPP-SCA method outperforms existing SDR
based approaches both in terms of complexity and accuracy, as the number of transmit antennas
increases. Future extensions of this work involve different optimization criteria, such as sum rate
maximization, as well as robust formulations.

Fig. 3. Normalized simulation time for increasing co-group user angular separation.